where $p_k(\tau)$ denotes the proportion of the $k$-th class of data points under the partitioning rule $(x = \tau)$. The Gini index is defined as below,

$$I(x = \tau) = \sum_{k=1}^{K} p_k(\tau)\left[1 - p_k(\tau)\right] \qquad (3.75)$$

It can be seen that, if a subspace is pure for one class, then $p_k(\tau) = 1$ and $1 - p_k(\tau) = 0$, or $p_k(\tau) = 0$ and $1 - p_k(\tau) = 1$; in either case, $p_k(\tau)\left[1 - p_k(\tau)\right] = 0$. If all subspaces are pure for one class, $I(x = \tau) = 0$. If a partitioning rule generates a random classification model, $p_k(\tau) = 0.5$ and the value of $I(x = \tau)$ is $0.25 \times K$. Therefore, the minimum value of the Gini index is zero, attained when a set of partitioning rules constitutes a perfect classification model, and the maximum value of the Gini index is $0.25K$.
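The Gini computation of Eq. (3.75) can be sketched in a few lines; the function and variable names below are illustrative, not from the text.

```python
def gini_index(p):
    """Gini index of Eq. (3.75): sum of p_k(tau) * (1 - p_k(tau)) over K classes.

    `p` is the list of per-class proportions p_k(tau) for one partitioning rule.
    """
    return sum(p_k * (1.0 - p_k) for p_k in p)


# A pure subspace (one class holds all the points) gives zero impurity.
print(gini_index([1.0, 0.0]))   # 0.0

# A random two-class split with p_k = 0.5 gives 0.25 * K = 0.5.
print(gini_index([0.5, 0.5]))   # 0.5
```

The two calls reproduce the extremes discussed above: zero for a perfect split and $0.25K$ for a random one.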

Another measurement is called the information gain, which is based on information theory and is defined as below, where $\log p_k(\tau)$ has been replaced by $\log\left(1 + p_k(\tau)\right)$ to avoid an infinite value as $p_k(\tau)$ approaches zero,

$$I(x = \tau) = \sum_{k=1}^{K} p_k(\tau)\log\left(1 + p_k(\tau)\right) \qquad (3.76)$$

The information gain should be maximised to obtain a better partitioning rule. If a partitioning rule generates a pure subspace, $p_k(\tau) = 1$ and therefore $p_k(\tau)\log\left(1 + p_k(\tau)\right) = 1$ (taking the base-2 logarithm). The maximum information gain is thus $K$, attained when all subspaces are pure.
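The modified gain of Eq. (3.76) can be sketched as follows, assuming the base-2 logarithm; the names are illustrative, not from the text.

```python
import math


def information_gain(p):
    """Modified information gain of Eq. (3.76).

    Sums p_k(tau) * log2(1 + p_k(tau)) over the per-class proportions in `p`.
    Using log(1 + p) instead of log(p) keeps the value finite as p_k -> 0.
    """
    return sum(p_k * math.log2(1.0 + p_k) for p_k in p)


# A pure subspace contributes 1 * log2(2) = 1 to the gain.
print(information_gain([1.0]))   # 1.0

# As p_k -> 0 the term vanishes instead of diverging to -infinity.
print(information_gain([0.0]))   # 0.0
```

The second call illustrates why the $\log(1 + p_k(\tau))$ substitution is made: $\log 0$ would be infinite, while $\log(1 + 0) = 0$.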

[Figure] The impurity measurement for partitioning rules applied to the data shown in (a). (a) The Gini index. (b) The entropy index (information gain).